Fast and Accurate Phonetic Spoken Term Detection

نویسنده

  • Roy Wallace
چکیده

For the first time in human history, large volumes of spoken audio are being broadcast, made available on the internet, archived, and monitored for surveillance every day. New technologies are urgently required to unlock these vast and powerful stores of information. Spoken Term Detection (STD) systems provide access to speech collections by detecting individual occurrences of specified search terms. The aim of this work is to develop improved STD solutions based on phonetic indexing. In particular, this work aims to develop phonetic STD systems for applications that require open-vocabulary search, fast indexing and search speeds, and accurate term detection. Within this scope, novel contributions are made within two research themes, that is, accommodating phone recognition errors and, secondly, modelling uncertainty with probabilistic scores. A state-of-the-art Dynamic Match Lattice Spotting (DMLS) system is used to address the problem of accommodating phone recognition errors with approximate phone sequence matching. Extensive experimentation on the use of DMLS is carried out and a number of novel enhancements are developed that provide for faster indexing, faster search, and improved accuracy. Firstly, a novel comparison of methods for deriving a phone error cost model is presented to improve STD accuracy, resulting in up to a 33% improvement in the Figure of Merit. A method is also presented for drastically increasing the speed of DMLS search by at least an order of magnitude with no loss in search accuracy. An investigation is then presented of the effects of increasing indexing speed for DMLS, by using simpler modelling during phone decoding, with results

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Phonetic recoding of phonologically ambiguous printed words.

Speech detection and matching simultaneously presented printed and spoken words were used to examine phonologic and phonetic processing of Hebrew heterophonic homographs. Subjects detected a correspondence between an ambiguous letter string and the amplitude envelopes of both dominant and subordinate phonological alternatives. Similar effects were obtained when the homographs were phonologicall...

متن کامل

Fast decoding for open vocabulary spoken term detection

Information retrieval and spoken-term detection from audio such as broadcast news, telephone conversations, conference calls, and meetings are of great interest to the academic, government, and business communities. Motivated by the requirement for high-quality indexes, this study explores the effect of using both word and sub-word information to find in-vocabulary and OOV query terms. It also ...

متن کامل

A phonetic search approach to the 2006 NIST spoken term detection evaluation

This paper details the submission from the Speech and Audio Research Lab of Queensland University of Technology (QUT) to the inaugural 2006 NIST Spoken Term Detection Evaluation. The task involved accurately locating the occurrences of a specified list of English terms in a given corpus of broadcast news and conversational telephone speech. The QUT system uses phonetic decoding and Dynamic Matc...

متن کامل

Phonetic query expansion for spoken document retrieval

We are interested in retrieving information from speech data using phonetic search. We show improvement by expanding the query phonetically using a joint maximum entropy N-gram model. The value of this approach is demonstrated on Broadcast News data from NIST 2006 Spoken Term Detection evaluation.

متن کامل

Addressing the out-of-vocabulary problem for large-scale Chinese spoken term detection

While the Out-Of-Vocabulary (OOV) problem remains a challenge for English spoken term detection tasks, it is underestimated for Chinese. This is because an Chinese OOV query term can still be matched as a sequence of Chinese characters, with each character itself being a word in the vocabulary. However, our experiments show that search accuracy levels differ significantly when a query is or is ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010